{
 "cells": [
  {
   "cell_type": "markdown",
   "id": "1004e8c5",
   "metadata": {},
   "source": [
    "## Preparing a Factor Ranking Model Using Zipline Pipelines"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "id": "0f9240f8",
   "metadata": {},
   "outputs": [],
   "source": [
    "import os\n",
    "import warnings"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "id": "b7692a0e",
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "from IPython.display import Markdown, display\n",
    "from zipline.data import bundles\n",
    "from zipline.data.bundles.core import load\n",
    "from zipline.pipeline import Pipeline\n",
    "from zipline.pipeline.data import USEquityPricing\n",
    "from zipline.pipeline.engine import SimplePipelineEngine\n",
    "from zipline.pipeline.factors import (\n",
    "    VWAP,\n",
    "    AnnualizedVolatility,\n",
    "    AverageDollarVolume,\n",
    "    BollingerBands,\n",
    "    CustomFactor,\n",
    "    DailyReturns,\n",
    "    ExponentialWeightedMovingAverage,\n",
    "    MaxDrawdown,\n",
    "    PercentChange,\n",
    "    Returns,\n",
    "    SimpleMovingAverage,\n",
    "    WeightedAverageValue,\n",
    ")\n",
    "from zipline.pipeline.loaders import USEquityPricingLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "id": "cd6b77d0",
   "metadata": {},
   "outputs": [],
   "source": [
    "warnings.filterwarnings(\"ignore\")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "010eb64b",
   "metadata": {},
   "source": [
    "### Option 1: Use the built-in bundle with free data\n",
    "\n",
    "This option uses the built-in data bundle provided by Zipline. It then acquires free US equities data that extend through 2018."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "a981f03d",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"QUANDL_API_KEY\"] = \"YOUR_API_KEY\"\n",
    "bundle = \"quandl\"\n",
    "bundles.ingest(bundle)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "b77845f5",
   "metadata": {},
   "source": [
    "### Option 2: Use the custom bundle with premium data\n",
    "\n",
    "This option uses the custom bundle with premium data. Follow the steps here: https://pyquantnews.com/ingest-premium-market-data-with-zipline-reloaded/ before using."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "1634cf05",
   "metadata": {},
   "outputs": [],
   "source": [
    "os.environ[\"DATALINK_API_KEY\"] = \"YOUR_API_KEY\"\n",
    "bundle = \"quotemedia\"\n",
    "load_extensions(True, [], False, os.environ)\n",
    "bundles.ingest(bundle)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "c1535763-cc66-47b3-9f0b-c2e32859ca82",
   "metadata": {},
   "source": [
    "Ingest the bundle data from your selected bundle."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "id": "f15952fe-6ca1-4a1b-b624-56f83e535f8b",
   "metadata": {},
   "outputs": [],
   "source": [
    "bundle_data = load(bundle, os.environ, None)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "a5229994",
   "metadata": {},
   "source": [
    "Create a USEquityPricingLoader"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "id": "80e9a003",
   "metadata": {},
   "outputs": [],
   "source": [
    "pipeline_loader = USEquityPricingLoader(\n",
    "    bundle_data.equity_daily_bar_reader, bundle_data.adjustment_reader, fx_reader=None\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "9b6ad050",
   "metadata": {},
   "source": [
    "Initialize a SimplePipelineEngine"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "id": "7b9758e6",
   "metadata": {},
   "outputs": [],
   "source": [
    "engine = SimplePipelineEngine(\n",
    "    get_loader=lambda col: pipeline_loader, asset_finder=bundle_data.asset_finder\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "065e4f3e",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "Define a custom momentum factor"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "id": "e67c31b0",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "outputs": [],
   "source": [
    "class MomentumFactor(CustomFactor):\n",
    "    \"\"\"Momentum factor\"\"\"\n",
    "\n",
    "    inputs = [USEquityPricing.close, Returns(window_length=126)]\n",
    "    window_length = 252\n",
    "\n",
    "    def compute(self, today, assets, out, prices, returns):\n",
    "        out[:] = (\n",
    "            (prices[-21] - prices[-252]) / prices[-252]\n",
    "            - (prices[-1] - prices[-21]) / prices[-21]\n",
    "        ) / np.nanstd(returns, axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "6f1bf5b2",
   "metadata": {
    "lines_to_next_cell": 2
   },
   "source": [
    "Define a function to create a pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "id": "8d58e82f",
   "metadata": {},
   "outputs": [],
   "source": [
    "def make_pipeline():\n",
    "    momentum = MomentumFactor()\n",
    "    dollar_volume = AverageDollarVolume(window_length=30)\n",
    "\n",
    "    return Pipeline(\n",
    "        columns={\n",
    "            \"factor\": momentum,\n",
    "            \"longs\": momentum.top(50),\n",
    "            \"shorts\": momentum.bottom(50),\n",
    "            \"rank\": momentum.rank(),\n",
    "        },\n",
    "        screen=dollar_volume.top(100),\n",
    "    )"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "ee254dcc",
   "metadata": {},
   "source": [
    "Run the pipeline"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "id": "e6bc5acc",
   "metadata": {},
   "outputs": [],
   "source": [
    "results = engine.run_pipeline(\n",
    "    make_pipeline(), pd.to_datetime(\"2012-01-04\"), pd.to_datetime(\"2012-03-01\")\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "90a2fdc3",
   "metadata": {},
   "source": [
    "Clean and display the results"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "id": "36fa26b8",
   "metadata": {},
   "outputs": [],
   "source": [
    "results.dropna(subset=\"factor\", inplace=True)\n",
    "results.index.names = [\"date\", \"symbol\"]\n",
    "results.sort_values(by=[\"date\", \"factor\"], inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "id": "f0263cb9",
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th>factor</th>\n",
       "      <th>longs</th>\n",
       "      <th>shorts</th>\n",
       "      <th>rank</th>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>date</th>\n",
       "      <th>symbol</th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "      <th></th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">2012-01-04</th>\n",
       "      <th>Equity(300 [BAC])</th>\n",
       "      <td>-2.522045</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>165.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1264 [GS])</th>\n",
       "      <td>-2.215784</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>220.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1888 [MS])</th>\n",
       "      <td>-2.204802</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>225.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1894 [MSFT])</th>\n",
       "      <td>-1.949654</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>295.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(457 [C])</th>\n",
       "      <td>-1.830819</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>345.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>...</th>\n",
       "      <th>...</th>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "      <td>...</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th rowspan=\"5\" valign=\"top\">2012-03-01</th>\n",
       "      <th>Equity(3105 [WMT])</th>\n",
       "      <td>3.409414</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>2607.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1690 [LLY])</th>\n",
       "      <td>3.809608</td>\n",
       "      <td>False</td>\n",
       "      <td>False</td>\n",
       "      <td>2642.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(399 [BMY])</th>\n",
       "      <td>4.689588</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>2685.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1770 [MCD])</th>\n",
       "      <td>4.816880</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>2691.0</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>Equity(1789 [MDLZ])</th>\n",
       "      <td>5.680276</td>\n",
       "      <td>True</td>\n",
       "      <td>False</td>\n",
       "      <td>2706.0</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "<p>4000 rows × 4 columns</p>\n",
       "</div>"
      ],
      "text/plain": [
       "                                  factor  longs  shorts    rank\n",
       "date       symbol                                              \n",
       "2012-01-04 Equity(300 [BAC])   -2.522045  False   False   165.0\n",
       "           Equity(1264 [GS])   -2.215784  False   False   220.0\n",
       "           Equity(1888 [MS])   -2.204802  False   False   225.0\n",
       "           Equity(1894 [MSFT]) -1.949654  False   False   295.0\n",
       "           Equity(457 [C])     -1.830819  False   False   345.0\n",
       "...                                  ...    ...     ...     ...\n",
       "2012-03-01 Equity(3105 [WMT])   3.409414  False   False  2607.0\n",
       "           Equity(1690 [LLY])   3.809608  False   False  2642.0\n",
       "           Equity(399 [BMY])    4.689588   True   False  2685.0\n",
       "           Equity(1770 [MCD])   4.816880   True   False  2691.0\n",
       "           Equity(1789 [MDLZ])  5.680276   True   False  2706.0\n",
       "\n",
       "[4000 rows x 4 columns]"
      ]
     },
     "metadata": {},
     "output_type": "display_data"
    }
   ],
   "source": [
    "display(results)"
   ]
  },
  {
   "cell_type": "markdown",
   "id": "8edd7c30",
   "metadata": {},
   "source": [
    "**Jason Strimpel** is the founder of <a href='https://pyquantnews.com/'>PyQuant News</a> and co-founder of <a href='https://www.tradeblotter.io/'>Trade Blotter</a>. His career in algorithmic trading spans 20+ years. He previously traded for a Chicago-based hedge fund, was a risk manager at JPMorgan, and managed production risk technology for an energy derivatives trading firm in London. In Singapore, he served as APAC CIO for an agricultural trading firm and built the data science team for a global metals trading firm. Jason holds degrees in Finance and Economics and a Master's in Quantitative Finance from the Illinois Institute of Technology. His career spans America, Europe, and Asia. He shares his expertise through the <a href='https://pyquantnews.com/subscribe-to-the-pyquant-newsletter/'>PyQuant Newsletter</a>, social media, and has taught over 1,000+ algorithmic trading with Python in his popular course **<a href='https://gettingstartedwithpythonforquantfinance.com/'>Getting Started With Python for Quant Finance</a>**. All code is for educational purposes only. Nothing provided here is financial advise. Use at your own risk."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "id": "8334f85c-3286-4e6c-b77b-bc107fde21f5",
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "jupytext": {
   "cell_metadata_filter": "-all",
   "main_language": "python",
   "notebook_metadata_filter": "-all"
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.13"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 5
}
